This study addressed issues raised in the literature 
on science and mathematics teacher certification testing concerning 
the validity of job analysis data and the test domain defined by the 
job analysis. More specifically, the issues addressed are those 
associated with race, gender, and age. Questionnaires were sent to 
2,801 mathematics and 2,468 science teachers or teacher supervisors 
identified by the Georgia Department of Education as certified in 
these fields. A total of 25 different forms of the Georgia Teacher 
Certification Test had been developed to represent the fields of 
secondary certification (grades 7 through 12) in the state. The forms 
were distinguished by task statements pertinent to particular 
subjects taught; the science form had 148 unique task statements, 
while the mathematics form had 160 such statements. For science and 
mathematics teachers, respectively, 1,384 and 1,600 usable responses 
were available. Teachers rated the task statements by indicating for 
each one whether they actually performed the task, its importance to 
the learning process, and the possibility of successful performance 
by minimally competent teachers. Task statement content clusters were 
identified as well as simple effects and group mean differences. 
Results indicate significant effects based on race for science 
teachers, but no other significant effects based on race, sex, or 
age. Fourteen data tables and six graphs present study data. (TJH) 
Issues have been raised in the literature concerning the 
validity of job analysis data and the test domain defined by the 
job analysis. For example, it has been suggested that job 
content may not be independent of personal characteristics of job 
incumbents. In this work, teacher importance ratings of 
secondary science and mathematics job analysis task statements 
were evaluated for possible response differences by gender, race 
and age. The results showed that while content subareas in both 
fields were rated of differential importance by teachers, in 
general there were not important group rating differences that 
would have led to gender, race, or age bias in a definition of 
the test content domain based on the job analysis responses. 




Teacher Licensure Test Job Analysis Response by Gender, 
Race, and Age: Secondary Science and Mathematics 



A significant aspect of the current reform movement in 
education is the increasing use of assessment procedures to 
evaluate prospective education students, education graduates 
seeking initial certification, and veteran certified teachers 
(Shulman, 1986; Mehrens, 1986 & 1987; Lehmann and Phillips, 
1987). In different states, this has resulted in a variety of 
assessments, including basic skills, professional or pedagogical 
knowledge, general knowledge, subject-matter knowledge, and 
performance ratings. In some states, examinations are produced 
locally; more commonly, states adopt for use tests produced by 
the Educational Testing Service (ETS) and other testing concerns 
such as National Evaluation Systems (NES) (Lehmann and Phillips, 
1987). 

Fundamental to all test construction and use is the 
demonstration of test validity, or the correctness of the 
interpretation of test scores (Cronbach, 1971). For licensure or 
certification tests, the appropriate validity strategy is the 
content validity approach (Kane et al., 1989), which requires the 
establishment of job-relatedness of the content domain to be 
sampled by the test (APA, AERA, NCME, 1985; Mehrens, 1986 & 
1987). Except for tests developed nationally, such as the 
National Teachers Examination (NTE), and adopted after state-level 
validation studies are conducted (Cross, 1985), job-relatedness 
is generally based on the results of a job analysis 
that is conducted during test construction. Mehrens (1986) 
states that "What appears to be the most common and feasible 
approach for doing the job analysis is through a survey of the 
people in the profession" (p. 28). 

The Guidelines (Uniform Guidelines on Employee Selection 
Procedures, EEOC, 1978), although specifying the necessity for 
job analysis in validation, state: 

Any method of job analysis may be used if it provides the 
information required for the specific validation strategy 
used. (p. 38300) 

The Standards (11.1) (Standards for Educational and Psychological 
Testing, APA, 1985) also state "job analyses provide the basis 
for defining the content domain [in licensure and certification 
tests]" (p. 64), but do not specify any job analysis methodology. 
The other document that guides test standards, the APA Principles 
for the Validation and Use of Personnel Selection Procedures 
(1987), also provides no guidance in job analysis methodology or 
adequacy. For high-stakes tests such as the Georgia teacher 
certification tests (TCT), the decision of validity or invalidity 
is often resolved in court. The adequacy and validity of the job 
analysis is critical to the content validity argument and the 
inferences to be drawn from the test results (Elliot, 1987). 



However, there is a dearth of research on job analysis in 
licensure testing, and few guidelines are available from 
personnel psychology or other literature with which to frame an 
evaluation of job analysis adequacy and validity. 

An aspect of job analysis validity examined here involves 
investigating possible job analysis response differences 
associated with job incumbent characteristics (sex, race, age). 
The teaching fields considered were secondary science and 
mathematics. Both are subject-matter content intensive teaching 
fields, and both have a relatively high proportion of male 
teachers. 

The Need For Job Analysis Evaluation 

The need for job analysis evaluation is expressed in the 
Standards : 

Probable sources of variance that would confound the 
construct or domain definitions underlying the test 
should be investigated by the test developer, and the 
implications of the results for test design, 
interpretation, and use should be presented in the 
technical manual or in supplementary reports. (p. 28, 
Standard 3.12) 

Tenopyr (1986), commenting on the proliferation of job 
analysis techniques, notes: 

Although a universal job analysis system is not advocated, 
it appears that there is a need for developing some 
principles for analyzing jobs. Despite the large number of 
job analyses being done today, there does not appear to be 
available the research base from which the needed principles 
can be drawn.... The major question of the validity of the 
masses of data which have been generated is of utmost 
importance.... Various types of raters should be examined, 
e.g., supervisors, incumbents, psychologists. Different 
specificity levels of construct should be employed. Studies 
to determine the degree of response style associated with 
such ratings should be undertaken. (p. 283) 

Barrett (1981) points out simply that "there is no agreement on 
what makes an adequate job analysis" (p. 586). 

Prien notes in his 1977 review that although job analyses 
have become increasingly important in selection procedures, there 
is still an "absence of research which defines the necessary and 
sufficient job analysis method" (p. 167). He identified basic 
issues in the context of content validity that require attention: 
1) job analysis reliability and validity, 2) job functions and 
individual differences, 3) the research designs needed to produce 
appropriate information, and 4) the sufficiency of job analysis 
information. 
Guion (1978) raises other issues regarding fairness and bias 
in the content domain. He states that "the idea that content 
domain samples are inherently fair seems widespread" and that 
this is probably a correct idea because "carefully constructed 
content domain samples seem likely to be free from bias" (p. 
502). However, he warns that this "assumption of fairness may be 
vulnerable at several points," one of which is the assumption 
that the "job content domain is independent of the 
characteristics of the people who hold the job" (p. 502). This 
may not be true if "actual job content differs in different 
subgroups of incumbents" (p. 502). If it is the case that 
"important and testable aspects of what are actually two jobs 
[are] treated as one, then a test sampling of one is unfair to 
applicants for the other" (p. 503). This might be the case, 
according to Guion, when an affirmative action hiring produces 
"qualitatively different jobs for men and for women or for 
minority and nonminority employees" (p. 503). 

Guion (1978) also speculates that job content might not be 
independent of personal characteristics in positions that allow 
different styles of work. In this case: 

Over time, these differences may produce 
qualitatively different jobs. This gradual 
process of change can be described as "drift" in job 
definition. Groups of people who think and behave 
alike and are in continuous communication with each 
other may drift in common ways. If these differences 
are identified with racial differences, a cultural 
drift may occur. In what is supposed to be an 
increasingly integrated society, we paradoxically find 
more and more voluntary social isolation of minorities. 
This sets a stage for drift in different directions for 
different cultural groups. A parallel drift that has 
no connotation of race or sex may occur for people who 
use different competencies in achieving the same ends. 
(p. 503) 

If such drift is trivial, Guion goes on to say, it becomes a 
trivial issue in content domain definition. "However, a 
substantial problem of fairness could arise when the test content 
domain is substantially defined by purely stylistic elements 
irrelevant to the actual quality of performance on the job" (p. 
503). It is certainly the case that competent teaching allows 
for diversity of style. Madaus (1987) also raises many issues 
about the current validation methods employed in licensure test 
validation. He questions the validity of the expert judgments 
used to define the test domain and challenges researchers to 
generate and test disconfirming hypotheses about the validity of 
the test validation methods currently employed. 

In addition to these general questions about the validity of 
job analysis data, there are results of research on job ratings 
and other rating research indicating that specific response 
differences have been observed or that response differences might 
be expected. In job analyses conducted to establish the job- 
relatedness of tests designed to measure pedagogical knowledge 
(Elliot, 1987) and non-subject-matter related job content 
(Potter, 1980), teacher rating responses to task statements were 
different for educators who differed on job environment (Elliot, 
1987) or job context variables (Potter, 1980) such as grade 
levels and teaching environment. Outside variables such as 
experience and training background did not produce significant 
response differences. 

Bemis, Belenky, and Soder (1983) note: 

While no particular job analysis method is prescribed, 
government regulations and court decisions clearly indicate 
a belief that job analysis will lead to fair job-related 
personnel practices. Very little research has been done on 
this belief or on other general job analysis questions. A 
preliminary study on this question conducted by ...[Boyles, 
Palmer, and Veres, 1980] indicates that blacks and whites 
may respond differently to judgmental job analysis 
questions. (p. 144) 

Veres, Boyles, and Champion (1983) report that Boyles et al. 
(1980): 

Found significant differences between black and white 
subject matter experts (SMEs) on job analysis ratings of 
clerical jobs. In this case, differences in job analysis 
ratings were not accompanied by similar differences in 
scores on the selection device. Black applicants in fact 
scored highest (vis-à-vis White applicants) on those areas 
of the selection test that black incumbents had rated lower 
than white incumbents on the job analysis. (p. 3) 

Veres et al. (1983) suggested that "these findings appear to 
preclude blind faith in the fairness assumption without further 
study" (p. 3). 

Veres et al. (1983) pointed out that Boyles et al. (1980) 
did not determine whether the differences in ratings were the 
result of different job content or different response ratings of 
the same job content. In a similar study, Veres found "no 
evidence of [significant] racial differences in rating accuracy," 
(p. 6) but found "a number of inaccurate raters within each 
racial group" (p. 6) rating real and bogus tasks associated with 
their own jobs. These differences were greatest in ratings of 
criticality, job-entry preparedness, and the relationship of task 
performance and overall job performance. The differences in the 
ratings of these variables could lead to selection devices that 
are biased. 
Rationale for This Study 



The reasons for evaluating the job analysis responses by 
race, gender, and age are as follows. First, in the fields of 
science and math, there is evidence of performance differences 
for male and female students on standardized tests. It is 
possible that science and math TCT content knowledge exams could 
result in adverse impact for female job applicants or current 
teachers. Pallas and Alexander (1983) state that it is: 

Well-established that by the end of high school there are 
large sex differences [favoring boys] in mathematical 
aptitude and achievement ... [and that this difference is] 
recurrent across countries and over time, and is obtained on 
various standardized tests. (p. 165) 

The authors cite mean differences of 41 to 51 points on SAT 
mathematics scores reported between 1970 and 1976, and a 35 point 
difference in their own study. The data indicate that about 60% 
of the gap in quantitative performance is due to course 
differences in high school and the authors link this to 
differential sex-role socialization of boys and girls at earlier 
ages. The question here is whether the different socialization, 
the likely difference in coursework, and the residual male-female 
discrepancy in mathematics scores unexplained by coursework 
differences might cause male and female mathematics and science 
teachers to rate the importance of science and mathematics tasks 
differently on a job analysis. 

Second, the response by race should be evaluated for 
differences because TCTs are usually challenged in court as a 
result of adverse minority impact. In LULAC v. State of Texas 
(1985), a preliminary injunction (later overturned) was granted 
against the use of a basic skills test for undergraduates seeking 
to enroll in teacher education courses that adversely impacted 
minority students. One criticism that plaintiffs had regarding 
the validation study was that the survey responses to questions 
about adequacy of preparation for the test had not been broken 
down by race. When there is adverse impact, demonstration of the 
representativeness of the sample will become an important issue 
if there are suspected differences in job analysis responses by 
race, region, gender, or some other classification considered 
arbitrary under Title VII or the 14th Amendment. 

Third, a recent EEOC determination (EEOC, 1988) stated that 
the Texas education agencies charged in the complaint: 

have discriminated against Blacks and persons over 40 
years of age who took the TECAT [Texas Examination of 
Current Administrators and Teachers] in 1986 and were 
removed from their teaching positions as a result. 
Accordingly, we find that Title VII and the ADEA have 
been violated as to Charging party and all similarly 
situated individuals. (p. 2) 
The ADEA (Age Discrimination in Employment Act of 1967) prohibits 
age discrimination in employment. Although amendments to the 
ADEA have raised and finally eliminated a maximum age for which 
protection applies, 40 has remained the minimum age at which 
protection from age discrimination begins (Fretz and Dudovitz, 
1987). Response differences associated with age of rater may 
contribute to bias in content domain definition. 

In summary, there is little guidance in the literature 
regarding the adequacy or quality of a job analysis. As job 
analysis has become more important to the validity argument of an 
increasing number of high-stakes tests, several questions have 
been raised in the literature regarding job analysis methodology, 
adequacy, and fairness. Questions have also been raised 
regarding the relationship of person-characteristics, including 
race, to job analysis responses. In the legal arena, where 
issues of test bias and adverse impact are discussed, the 
validity of the job analysis responses has become an issue in 
test-related court cases (Kuehn et al., 1989). 



Analysis Samples 

The Georgia Teacher Certification Tests (TCT) are undergoing 
revision. As a first step in this process, the Georgia Job 
Analysis Questionnaire (JAQ) was distributed by the Georgia 
Assessment Project (GAP) in January, 1987, to all certified 
personnel identified by the Georgia Department of Education. 
Questionnaires were sent to 2801 math and 2468 science teachers 
or teacher supervisors identified as certified in these fields. 
Twenty-five different questionnaire forms had been developed to 
represent the fields of certification in the state. On each 
form, the first 54 task statements were the same and included 
teacher activities identified as common to the profession. The 
remainder of the task statements were related to the content of 
each field. The task statements were written by GAP staff with 
the help of content experts and are based on a review of the 
literature in the field, the school curricula, and classroom 
observations. The science form had a total of 148 task 
statements and the math form had 160 statements. In addition to 
[...] characteristics of both groups of respondents. 

On the Georgia JAQ, teachers rated the task statements by 
indicating for each one whether they actually performed the task, 
how important it was to the learning process on a one to five 
scale (little, some, moderate, considerable, or great 
importance), and whether a minimally competent teacher should be 
able to perform the task throughout job tenure. 

Task Statement Content Clusters 



Analyses were done on average ratings of clusters of related 
task statements. Science task statements were grouped into five 
broad categories by science teachers and teacher educators at a 
meeting held to begin revision of the current TCT. In the 
sorting done by the science content experts, task statements 
could fall into more than one category because of the 
interrelated nature of the science fields. For the purposes of 
this study, the 15 task statements that overlapped categories 
were eliminated and the remaining 79 that the content experts 
generally agreed represented one content area were retained. 
These fell into the following five content clusters: general 
scientific processes (17 tasks), biology (21 tasks), chemistry 
(14 tasks), physics (16 tasks), and earth science (11 tasks). 
These generally represent the curriculum in grades 7 through 12, 
the range covered by the secondary certificates. 

Revision meetings have not yet been held in secondary math 
so the categories indicated by two math teachers who worked in 
the development of the task statements were used to cluster math 
tasks and eliminate those that overlapped content categories. 
These teachers judged 80 of the tasks to fall into one of six 
categories: basic math concepts (21), algebra (19), geometry 
(10), trigonometry (8), calculus (5), and computers and 
programming (17). The remaining 26 task statements were not used 
in the analysis of the mean responses for the content clusters. 
The mean importance ratings for each content area are plotted in 
Figures 1 through 6. 
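
The cluster averaging described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the cluster sizes are those reported in the text, but the task identifiers, the sample ratings, and the function name are hypothetical, and the handling of skipped items is an assumption.

```python
from statistics import mean

# Cluster sizes as reported in the text (79 science tasks, 80 math tasks).
SCIENCE_CLUSTERS = {"general": 17, "biology": 21, "chemistry": 14,
                    "physics": 16, "earth science": 11}
MATH_CLUSTERS = {"basic": 21, "algebra": 19, "geometry": 10,
                 "trigonometry": 8, "calculus": 5, "computers": 17}


def cluster_means(ratings, clusters):
    """Average one respondent's 1-5 importance ratings within each cluster.

    ratings:  {task_id: rating}
    clusters: {cluster_name: [task_id, ...]}
    Tasks the respondent skipped are ignored (an assumption); a cluster
    with no answered tasks yields None.
    """
    result = {}
    for name, task_ids in clusters.items():
        answered = [ratings[t] for t in task_ids if t in ratings]
        result[name] = mean(answered) if answered else None
    return result


# Hypothetical task ids and ratings, for illustration only.
example_clusters = {"biology": ["b1", "b2", "b3"], "physics": ["p1", "p2"]}
example_ratings = {"b1": 5, "b2": 4, "b3": 3, "p1": 2}
example_means = cluster_means(example_ratings, example_clusters)
print(example_means)
```

Each respondent then contributes one mean per content cluster, and those per-cluster means are the dependent variables in the profile analysis.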

Multivariate analysis of variance profile analysis was used 
to evaluate rating differences of the mean importance of the 
science and math task clusters by the three independent variable 
groupings (Harris, 1985, and Nunnally, 1978). This allowed for 
tests of significance for group by content cluster interactions 
(parallelism test), group mean differences in importance ratings 
(main effects or levels test), and differences in the ratings of 
the content clusters (flatness test). The levels test, or test 
of group mean differences, is a univariate test and is reported 
as such. The other two tests in profile analysis (parallelism 
and flatness) are multivariate tests and the results are reported 
accordingly as Wilks' lambda, its associated F, and its significance. 
Where significant main effects were found for groups, analysis of 
variance was used to discover the simple effects of group 
differences at each content cluster rating. Results of the 
MANOVAs and ANOVAs are reported in Tables 2 through 12 for 
science and math. These tables appear under the appropriate plot 
of mean responses for each variable. Table 13 summarizes the 
significant multivariate results found in the analyses in both 
fields. 
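
As a rough cross-check on the tabled multivariate results, the F associated with Wilks' lambda can be recomputed from lambda and its degrees of freedom. The sketch below assumes the two-group case, where the conversion is exact. The function names are illustrative, and the sample size of 1,280 is an assumption inferred from the degrees of freedom reported in Table 2 rather than a figure stated in the text.

```python
def wilks_to_f(wilks_lambda, df_hypothesis, df_error):
    """Convert Wilks' lambda to F.

    Exact when the hypothesis has a single degree of freedom per variate,
    as in the two-group parallelism and flatness tests used here.
    """
    return ((1.0 - wilks_lambda) / wilks_lambda) * (df_error / df_hypothesis)


def two_group_profile_dfs(n_respondents, n_clusters):
    """Degrees of freedom for a two-group profile analysis.

    Parallelism and flatness are tested on the (n_clusters - 1) differences
    between adjacent cluster means; levels is a univariate test on the
    overall mean rating.
    """
    n_groups = 2
    multivariate = (n_clusters - 1, n_respondents - n_groups - n_clusters + 2)
    levels = (n_groups - 1, n_respondents - n_groups)
    return multivariate, levels


# Science by sex (Table 2): five content clusters. N = 1280 is an assumption
# inferred from the reported df of 4,1275 and 1,1278.
multivariate_df, levels_df = two_group_profile_dfs(1280, 5)
flatness_f = wilks_to_f(0.722, *multivariate_df)
print(multivariate_df, levels_df, round(flatness_f, 1))
```

The recomputed flatness F comes out near the tabled 122.702; the small gap is what one would expect from lambda being reported to only three decimals.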

Because of the likelihood of finding significant results 
with the large sample sizes in this study, estimates of effect 
size were calculated where any significant result was found. The 
general formula suggested by Maxwell, Camp, and Arvey (1981) for 
estimating strength of association (omega squared) in factorial 
designs is: 



$$\omega^2 = \frac{SS_{\text{effect}} - (df_{\text{effect}} \times MS_w)}{SS_{\text{total}} + MS_w}$$

For the calculation of ω², the averaged univariate test sums 
of squares were used for the parallelism and flatness tests. 
Table 14 summarizes the estimated effect size or variability 
accounted for for the significant results. 

In his discussion of effect size, Cohen (1977) states that it 
is rare in the behavioral sciences to see a large effect size. 
The frame of reference he suggests is that an 
effect size of .01 is considered small, .06 is medium, and .16 or 
greater is large in behavioral research. The results of this 
study were interpreted according to this standard. 
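
The omega-squared estimate and the benchmarks quoted from Cohen (1977) can be sketched in a few lines. The function names and the sums of squares in the first call are hypothetical and serve only to show the arithmetic; the last three values are estimates reported in Table 14.

```python
def omega_squared(ss_effect, df_effect, ms_within, ss_total):
    """Estimate strength of association (omega squared) for one effect."""
    return (ss_effect - df_effect * ms_within) / (ss_total + ms_within)


def cohen_label(effect_size):
    """Classify an estimate using the benchmarks quoted in the text."""
    if effect_size >= 0.16:
        return "large"
    if effect_size >= 0.06:
        return "medium"
    if effect_size >= 0.01:
        return "small"
    return "below small"


# Hypothetical sums of squares, for illustration only.
estimate = omega_squared(ss_effect=12.0, df_effect=1,
                         ms_within=0.5, ss_total=800.0)
print(round(estimate, 4), cohen_label(estimate))

# Estimates reported in Table 14, classified with the same benchmarks.
for reported in (0.00399, 0.0103, 0.1437):
    print(reported, cohen_label(reported))
```

Under these cutoffs the group effect for sex in science falls below the small range, the group effect for race in science is small, and the math content-area effect is medium, which agrees with the markings in Table 14.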



Interaction of Importance Ratings by Group 

As summarized in Table 13, the test of profile parallelism 
produced significant results (nonparallel profiles) for all 
variables in math and for sex and race in science. However, as shown in 
Table 14, the significant interactions did not have effect sizes 
that reached what Cohen defines as the small range. 

Main Effects: Group Mean Differences 

When significant interactions (lack of parallelism) are found, 
the main effects (group differences) or [...] (rating differences) 
are difficult to interpret and may not be meaningful (Harris, 
1985; [...]). However, because of the exploratory nature of [...]: 

If there is no interaction, or if the interaction is 
significant but trivial [the case with these results], the 
outcome of the F tests involving the main effects can be 
interpreted without qualification. With a sizable and 
[...] on hand, the meanings of 
these F tests must be interpreted with caution. 
(p. 211) 

Significant main effects or group (sex, race, age) 
differences were found for sex and race in science but for none 
of the group differences in math (Table 13). Table 14 shows that 
only the race difference (Black respondents rated tasks higher in 
importance) had a meaningful effect size. Other 
researchers have reported similar results. Veres (1983) found 
that where there were differences in ratings, Blacks also 
rated the tasks of higher importance. Rosenfeld et al. (1986) 
reported ratings of tasks by race (Appendix J, 
Table 4) but did not test differences for significance. On all 
variables, Black teachers rated the tasks of higher importance 
than did White teachers. Elliot (1987) did not report importance 
ratings and did not report analyses of the time-spent ratings by 
incumbent characteristics such as race, sex, or age. 

In both fields, significant differences in mean content 
cluster ratings were found. There is a definite hierarchy of 
importance of the content of each field. In both fields, the 
basic or core content is rated the highest in importance. The 
content taught with less frequency is rated lower in importance. 
These response differences could be due to a perception of 
importance to student learning based on the small number of 
students who actually study these higher level subjects or the 
differences could be due to the teachers' familiarity with the 
content. If teachers are less knowledgeable about the higher 
level course content, they may be rating it lower in importance 
for that reason. As shown in Table 14, the differences in 
science content rating show a small effect size (ω² = .03) while 
the math content ratings show a moderate effect size (ω² = .14). 

Simple Effects 

To explore the source of group rating differences in each 
content area, tests of simple effects of each group were 
conducted using oneway ANOVA where the profiles were nonparallel 
and the overall group means were significantly different. To 
avoid the proliferation of Type I error that can be the result of 
multiple tests, the decision was made to suspend judgment on the 
significance of Fs that fell between .05 and an adjusted alpha 
level (Keppel, 1982, p. 163). Using the Bonferroni adjustment 
(Harris, 1985, p. 8), the critical alpha is set at .05 divided by 
the number of univariate tests conducted, or .01 for science and 
.0083 for math. Significance is indicated by probabilities 
smaller than these adjusted levels, judgment is suspended for 
probabilities in the corrected alpha to the .05 range, and 
nonsignificance is concluded for probabilities greater than .05. 
The results are reported in this manner in the tables associated 
with these tests. 
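
The three-way decision rule described above can be sketched as a small function. The function name and the treatment of probabilities that fall exactly on a boundary are assumptions; the significance levels are those reported for the simple effects of sex in science (Table 3).

```python
def bonferroni_decision(p_value, n_tests, alpha=0.05):
    """Return '**' (significant), '*' (suspend judgment), or 'NS'."""
    adjusted_alpha = alpha / n_tests  # .01 for 5 tests, about .0083 for 6
    if p_value < adjusted_alpha:
        return "**"
    if p_value <= alpha:
        return "*"
    return "NS"


# Significance levels for the simple effects of sex in science (Table 3).
science_sex = {
    "General": 0.001,
    "Biology": 0.000,
    "Chemistry": 0.071,
    "Physics": 0.391,
    "Earth Science": 0.013,
}
decisions = {area: bonferroni_decision(p, n_tests=len(science_sex))
             for area, p in science_sex.items()}
print(decisions)
```

With five science clusters the adjusted alpha is .01, so the earth science result (.013) falls in the suspend-judgment band, as marked in Table 3.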

Inspection of the results of these tests of simple effects 
for science (Tables 3 and 5) does not yield any generalizations 
about the content areas in which differences in importance 
ratings lie. In other words, the likelihood of any particular 
content area being rated differently by one group or another does 
not seem to be related to the frequency with which that content 
is taught. For race in science, the only variable with a 
meaningful effect size, all the simple effects were significant 
except for the earth science rating. 

In math, the simple effects tests are reported in Tables 8, 
10, and 12. All the math interactions were significant. Females 
rated the more basic math courses higher than males. The ratings 
were not different for the higher level math courses. For race, 
there is no clear pattern of differences and for the age 
categories, older teachers rated only geometry higher in 
importance. 
Conclusions 

The results show that the content areas rated by teachers on 
the questionnaire vary in their relative importance either 
because of teacher familiarity with the content or because of the 
frequency with which the content is taught by the teachers. 

Regarding group differences and interactions between the 
groups and the content ratings, significant results were found 
for sex and race in science. However, when effect size was 
estimated using ω², meaningful results were found only for race 
in the science ratings. No important sex rating differences were 
found although review of the literature showed that there is 
evidence of differential mastery of science and math content and 
differential test performance for males and females in these 
areas. There are no important age effects in the ratings. 

Because of the potential for adverse impact in selection 
procedures, racial differences in the ratings are of major 
concern. The test content domain, as defined by the job analysis 
responses, appears not to be biased. Of particular importance is 
the demonstration that Black job incumbents on average do not 
rate job content task statements lower in importance than White 
job incumbents. Content domain definition based on importance 
ratings would not yield test content considered unimportant by 
minority test-takers. The validity of the content domain and of 
the job analysis is supported by the absence of important group 
response differences. 
Table 1 

Gender, Race, and Age Breakdown for Questionnaire 
Respondents in Percentages 

              Science   Math 

Gender 
  Male        39%       27% 
  Female      61        73 

Race 
  Black       23        18 
  White       77        82 

Age 
  < 40        61        64 
  > 40        39        36 
Figure 1. Science: Mean task ratings by sex. [Plot not reproduced: mean importance rating by content area (general, biology, chemistry, physics, earth science) for Male (n=621) and Female (n=822) respondents.] 

Table 2 

Profile Analysis for Science Content by Sex 

Test          Wilks' lambda   MS      MS error   F         df       Sig. 

Parallelism   .991                               2.987     4,1275   .018 
Flatness      .722                               122.702   4,1275   .000 
Levels *                      4.743   .579       8.187     1,1278   .004 

* univariate test 


Table 3 

Univariate Tests for Simple Effects of Sex 

Content Area    F        df   Sig.   Decision 

General         11.100        .001   ** 
Biology         22.258        .000   ** 
Chemistry       3.269         .071   NS 
Physics         .735          .391   NS 
Earth Science   6.206         .013   * 

*  suspend judgment 
** significant 
Figure 2. Science: Mean task ratings by race. [Plot not reproduced: mean importance rating by content area (general, biology, chemistry, physics, earth science) for Black (n=310) and White (n=1042) respondents.] 

Table 4 

Profile Analysis for Science Content by Race 

Test          Wilks' lambda   MS       MS error   F         df       Sig. 

Parallelism   .988                                3.782     4,1286   .005 
Flatness      .719                                125.442   4,1286   .000 
Levels *                      11.532   .577       19.973    1,1289   .000 

* univariate test 


Table 5 

Univariate Tests for Simple Effects of Race 

Content Area    F        df   Sig.   Decision 

General         32.286        .000   ** 
Biology         29.508        .000   ** 
Chemistry       21.218        .000   ** 
Physics         7.602         .006   ** 
Earth Science   3.438         .064   NS 

*  suspend judgment 
** significant 



Figure 3. Science: Mean task ratings by age. [Plot not reproduced: mean importance rating by content area (general, biology, chemistry, physics, earth science) for the two age categories.] 


Table 6 

Profile Analysis for Science Content by Age Category 

Test          Wilks' lambda   MS      MS error   F         df       Sig. 

Parallelism   .994                               2.058     4,1315   .084 
Flatness      .715                               130.760   4,1315   .000 
Levels *                      1.258   .585       2.150     1,1318   .143 

* univariate test 
Figure 4. Mathematics: Mean task ratings by sex. [Plot not reproduced: mean importance rating by content area (basic, algebra, geometry, trigonometry, calculus, computers) for Male (n=426) and Female (n=1137) respondents.] 


Table 7 

Profile Analysis for Math Content by Sex 

Test          Wilks' lambda   MS      MS error   F         df       Sig. 

Parallelism   .981                               5.523     5,1417   .000 
Flatness      .363                               496.751   5,1417   .000 
Levels *                      2.043   .647       3.159     1,1421   .076 

* univariate test 

Table 8 

Univariate Tests for Simple Effects of Sex 

Content Area    F        df   Sig.   Decision 

Basic           30.561        .000   ** 
Algebra         6.574         .010   * 
Geometry        4.416         .036   * 
Trigonometry    2.556         .110   NS 
Calculus        .041          .840   NS 
Computers       .001          .975   NS 

NS not significant 
*  suspend judgment 
** significant 



Figure 5. Mathematics: Mean task ratings by race. [Plot not reproduced: mean importance rating by content area (basic, algebra, geometry, trigonometry, calculus, computers) for Black (n=288) and White (n=1279) respondents.] 


Table 9 

Profile Analysis for Math Content by Race 

Test          Wilks' lambda   MS      MS error   F         df       Sig. 

Parallelism   .972                               8.158     5,1424   .000 
Flatness      .366                               494.368   5,1424   .000 
Levels *                      .030    .650       .046      1,1428   .830 

* univariate test 

Table 10 

Univariate Tests for Simple Effects of Race 

Content Area    F        df   Sig.   Decision 

Basic           7.156         .008   ** 
Algebra         .992          .319   NS 
Geometry        .418          .518   NS 
Trigonometry    6.819         .009   * 
Calculus        4.065         .044   * 
Computers       7.399         .007   ** 

NS not significant 
*  suspend judgment 
** significant 
Figure 6. Mathematics: Mean task ratings by age. [Plot not reproduced: mean importance rating by content area (basic, algebra, geometry, trigonometry, calculus, computers) for respondents < 40 (n=1026) and > 40 (n=676).] 


Table 11 

Profile Analysis for Math Content by Age Category 

Test          Wilks' lambda   MS      MS error   F         df       Sig. 

Parallelism   .988                               3.431     5,1449   .004 
Flatness      .367                               499.193   5,1449   .000 
Levels *                      .180    .649       .278      1,1453   .598 

* univariate test 

Table 12 

Univariate Tests for Simple Effects of Age Category 

Content Area    F        df   Sig.   Decision 

Basic           .137          .712   NS 
Algebra         1.863         .172   NS 
Geometry        4.270         .039   * 
Trigonometry    .733          .392   NS 
Calculus        .018          .895   NS 
Computers       .424          .515   NS 

NS not significant 
*  suspend judgment 
** significant 

Table 13 

Summary of Significant Profile Analysis Results 

Incumbent                        Group        Subarea 
Characteristics   Interaction    Difference   Difference 

Science 
  Sex             *              *            * 
  Race            *              *            * 
  Age                                         * 

Math 
  Sex             *                           * 
  Race            *                           * 
  Age             *                           * 

* p < .05 


Table 14 

Estimates of Variability Accounted For (ω²) for Significant Results 

Incumbent 
Characteristics   Field     Interaction   Group      Content Area 

Sex               Science   .00067        .00399     .0329 * 
                  Math      .00077        -          .1432 ** 
Race              Science   .00064        .0103 *    .0333 * 
                  Math      .00278        -          .1437 ** 
Age               Science   -             -          .0342 * 
                  Math      .00021        -          .1434 ** 

*  small effect size 
** medium effect size 
(Cohen, 1977) 